Bootstrap Model Aggregation for Distributed Statistical Learning
In distributed or privacy-preserving learning, we are often given a set of probabilistic models estimated from different local repositories and asked to combine them into a single model that gives efficient statistical estimation. A simple method is to linearly average the parameters of the local models; this, however, tends to degenerate or be inapplicable for non-convex models, or for models with different parameter dimensions. A more practical strategy is to generate bootstrap samples from the local models and then learn a joint model on the combined bootstrap set. Unfortunately, the bootstrap procedure introduces additional noise and can significantly deteriorate the performance. In this work, we propose two variance reduction methods to correct the bootstrap noise, including a weighted M-estimator that is both statistically efficient and practically powerful. Both theoretical and empirical analyses are provided to demonstrate our methods.
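The bootstrap-based strategy described above can be illustrated with a minimal sketch. This is not the paper's implementation; it assumes, purely for illustration, that each local model is a univariate Gaussian with made-up parameters, and contrasts naive linear parameter averaging with refitting a joint model on combined bootstrap samples:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical local models: (mean, std) pairs, each imagined to have been
# estimated from a separate local data repository.
local_params = [(-0.2, 1.1), (0.1, 0.9), (0.3, 1.0)]

# Strategy 1: linear averaging of the parameter vectors.
lin_mean = np.mean([m for m, _ in local_params])
lin_std = np.mean([s for _, s in local_params])

# Strategy 2: draw n bootstrap samples from each local model, then refit a
# single joint model on the combined bootstrap set (for a Gaussian, the MLE
# is just the sample mean and sample standard deviation).
n = 10_000
combined = np.concatenate([rng.normal(m, s, size=n) for m, s in local_params])
boot_mean, boot_std = combined.mean(), combined.std()

print(f"linear avg:    mean={lin_mean:.3f}, std={lin_std:.3f}")
print(f"bootstrap fit: mean={boot_mean:.3f}, std={boot_std:.3f}")
```

The refitting step works for any model that can be both sampled from and fit by maximum likelihood, including latent variable models where parameter averaging breaks down. The finite bootstrap size n is exactly the source of the extra noise that the paper's variance reduction methods are designed to correct.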
Reviews: Bootstrap Model Aggregation for Distributed Statistical Learning
This is a follow-up work that corrects previous research on using KL averaging to combine subset estimators. I think the most appealing point of KL averaging, despite the computational issue, is its power in dealing with latent variable models. There is another line of work using the geometric median to combine subset estimators that the authors might want to compare to, for example, Minsker (2013) and Hsu and Sabato (2013). These algorithms are simple and efficient in most cases, but might not do well for latent variable models. The variance reduction technique used in this article is very similar to the de-biasing technique used in Javanmard and Montanari (2015) and Lee et al. (2015), so the theoretical contribution is somewhat limited.